
 conditional swap regret


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper investigates online learning algorithms that minimize a generalized version of swap regret. More precisely, the authors consider the notion of conditional swap regret, in which swap regret is defined against a stronger adversary than usual: the adversary's action may depend on the player's past sequence of actions. In particular, when the memory size of the adversary is restricted to k, the regret is called the k-gram conditional regret. The authors propose prediction strategies that achieve a k-gram conditional regret of O(\sqrt{N^k T \log N}) and a state-dependent regret bound, respectively. Moreover, using conditional swap regret, the authors define the conditional correlated equilibrium and show a convergence result.


Conditional Swap Regret and Conditional Correlated Equilibrium

Mehryar Mohri, Scott Yang

Neural Information Processing Systems

We introduce a natural extension of the notion of swap regret, conditional swap regret, that allows for action modifications conditioned on the player's action history. We prove a series of new results for conditional swap regret minimization.


Conditional Swap Regret and Conditional Correlated Equilibrium

Scott Yang Courant Institute and Google Courant Institute 251 Mercer Street

Neural Information Processing Systems



Conditional Swap Regret and Conditional Correlated Equilibrium

Mohri, Mehryar, Yang, Scott

Neural Information Processing Systems

We introduce a natural extension of the notion of swap regret, conditional swap regret, that allows for action modifications conditioned on the player’s action history. We prove a series of new results for conditional swap regret minimization. We present algorithms for minimizing conditional swap regret with bounded conditioning history. We further extend these results to the case where conditional swaps are considered only for a subset of actions. We also define a new notion of equilibrium, conditional correlated equilibrium, that is tightly connected to the notion of conditional swap regret: when all players follow conditional swap regret minimization strategies, then the empirical distribution approaches this equilibrium. Finally, we extend our results to the multi-armed bandit scenario.
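To make the idea of conditioning on a bounded action history concrete, the following is a minimal sketch, not the authors' algorithm. It only evaluates the k-gram conditional swap regret of an already realized action sequence in hindsight. The function name `conditional_swap_regret`, the use of realized actions instead of expectations over the player's distributions, and the handling of the first rounds (a shorter context is used when fewer than k-1 past actions exist) are all assumptions made for illustration. With k=1 the quantity reduces to ordinary swap regret.

```python
from collections import defaultdict


def conditional_swap_regret(actions, losses, k=2):
    """Hindsight k-gram conditional swap regret of a realized action sequence.

    actions: list of ints, actions[t] is the action played at round t.
    losses:  list of lists, losses[t][a] is the loss of action a at round t.
    k:       context length; the context is the last k-1 actions plus the
             current action (illustrative convention, shorter in early rounds).
    """
    n_actions = len(losses[0])
    # For each context, accumulate the loss every replacement action would
    # have suffered over the rounds in which that context occurred.
    totals = defaultdict(lambda: [0.0] * n_actions)
    incurred = 0.0
    for t, (a, loss) in enumerate(zip(actions, losses)):
        context = tuple(actions[max(0, t - (k - 1)): t + 1])
        incurred += loss[a]
        vec = totals[context]
        for b in range(n_actions):
            vec[b] += loss[b]
    # The best modification function picks, independently for each context,
    # the replacement action with the smallest accumulated loss.
    best = sum(min(vec) for vec in totals.values())
    return incurred - best


# Toy example with two actions: playing 1 is bad right after a 0,
# but good right after a 1.
actions = [0, 1, 1, 0, 1, 1]
losses = [[0, 1], [0, 1], [1, 0], [0, 1], [0, 1], [1, 0]]

swap = conditional_swap_regret(actions, losses, k=1)
bigram = conditional_swap_regret(actions, losses, k=2)
print(swap, bigram)
```

In this toy sequence no single replacement for action 1 helps, so the ordinary swap regret is zero, while a replacement that depends on the previous action recovers all of the incurred loss. This illustrates why the conditional notion is the stronger benchmark.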